Automated method and apparatus for processor thermal validation

ABSTRACT

A method and corresponding software for automatically validating a computer platform thermal solution. An application program is employed to selectively execute thermal stress code to cause the platform's processor to dissipate an amount of power corresponding to a predetermined value, such as a thermal design power dissipation value specified by the processor's manufacturer. In one embodiment, tests are performed while operating at this power dissipation level to determine if a thermal overload condition exists, which may be determined by the processor's temperature, an indication that the processor is throttled, or a signal provided by the processor indicating the processor has detected a thermal overload condition. In another embodiment, a thermal resistance value is calculated based on the processor power dissipation, the temperature of the processor, and the ambient temperature of the test environment. In one embodiment, the entire validation process is automatically performed by the application program without requiring any extraneous test equipment or temperature probes.

FIELD OF THE INVENTION

[0001] The field of invention relates generally to thermal validation of integrated circuits and, more specifically but not exclusively, relates to a method and apparatus for automatically performing thermal validation of microprocessors.

BACKGROUND INFORMATION

[0002] Excess heat is one of the primary causes of failure for microprocessors. Generally, the life of a microprocessor is a function of the thermal load applied during its use. Excessive heat results in internal breakdown of the processor circuits, eventually resulting in failure. With the introduction of processors with sub-micron circuit elements, ever-increasing transistor count (e.g., 42 million transistors for an Intel Pentium 4™ processor) and operating speed, the problem of preventing processor failure due to excess heat is exacerbated.

[0003] The objective of thermal management is to ensure that the temperatures of all components in a system are maintained within their functional temperature range. Within this temperature range, a component, and in particular its electrical circuits, is expected to meet its specified performance. Operation outside of the functional temperature range can degrade system performance, cause logic errors, or cause component and/or system damage. Temperatures exceeding the maximum operating limit of a component may result in irreversible changes in the operating characteristics of the component.

[0004] A common way to verify a thermal solution for a particular platform/processor begins with thermal design parameters for the processor type. Generally, a processor produces a baseline amount of heat by simply being powered (i.e., when in a sleeping state), and a variable amount of heat that is a function of the processing load encountered during operation. Other factors include the operating frequency, and structural parameters, such as circuit line width and density. Notably, different processors of the same design may exhibit significantly different thermal characteristics. In order to ensure processor longevity, the processor manufacturer publishes various thermal design parameters that are derived from extensive statistical-based testing of each processor type. For example, thermal design parameters such as an overall minimum heat transfer coefficient, maximum temperature, and thermal design power ratings are specified by the manufacturer for a particular processor model.

[0005] Engineers for system integrators (e.g., an original equipment manufacturer such as Hewlett-Packard, Dell, Compaq, IBM, Toshiba, Gateway, etc.) rely, generally at least in part, on these thermal design parameters to verify that their platform's thermal solution will provide sufficient cooling to ensure processor longevity. While this is generally not as much of an issue for desktop computer systems, which typically provide thermal solutions having large cooling margins, such large cooling margins are not realizable for laptop and notebook computers. Accordingly, it is generally necessary to verify the thermal design via testing, preferably using a statistically significant number of test samples to account for the variance in processor power. Under conventional testing techniques, this typically requires the use of various external instrumentation and thermal test components, including resistive thermal devices (RTDs) such as thermocouples or thermistors to measure the external processor temperature, electronic test equipment to measure and record the test results, etc. This type of thermal solution verification testing is both expensive and time-consuming.

BRIEF DESCRIPTION OF THE DRAWINGS

[0006] The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same becomes better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein like reference numerals refer to like parts throughout the various views unless otherwise specified:

[0007] FIG. 1 is a schematic diagram illustrating a portion of a thermal control circuit contained on a processor that may be employed in computer platforms for which thermal validation testing may be performed in accordance with embodiments of the invention;

[0008] FIG. 2 is a schematic diagram illustrating a processor throttling scheme corresponding to a condition in which a thermal overload condition has been detected;

[0009] FIG. 3 is a diagram illustrating a statistical distribution of applications vs. a relative amount of power a processor dissipates when executing the applications;

[0010] FIG. 4 is a diagram illustrating a statistical distribution of power dissipated by processors corresponding to a common type of processor when executing a common set of thermal stress code instructions designed to produce a predetermined level of processor power dissipation;

[0011] FIGS. 5A-5C collectively comprise a flowchart illustrating logic and operations performed by an application program that is used to validate a platform thermal solution in accordance with one embodiment of the invention;

[0012] FIG. 6 is a schematic diagram illustrating details of a voltage regulator that includes a built-in power measurement feature and illustrating further details of the thermal control circuit of FIG. 1;

[0013] FIG. 7 is a schematic diagram of a laptop computer illustrative of computer platforms having thermal solutions that may be validated via embodiments of the invention, and details of one embodiment in which a current processor power dissipation is determined through measuring the total amount of power consumed by the laptop computer; and

[0014] FIG. 8 is a schematic diagram illustrating a software architecture for an application program and corresponding thermal stress code that may be executed on a processor to validate a platform's thermal solution in accordance with one embodiment of the invention.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

[0015] Embodiments of methods and apparatus for automatically performing processor thermal validation are described herein. In the following description, numerous specific details are set forth to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention can be practiced without one or more of the specific details, or with other methods, components, materials, etc. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring aspects of the invention.

[0016] Reference throughout this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

[0017] In a system environment, the processor temperature is a function of both system and component thermal characteristics. The system level thermal constraints consist of the local ambient air temperature and airflow over the processor as well as physical constraints at and above the processor. The processor temperature depends in particular on the component power dissipation, the processor package thermal characteristics, and the processor thermal cooling solution.

[0018] For many years, thermal cooling solutions were designed to ensure that a processor never reached a maximum operating temperature. In essence, they were designed for the worst-case scenario. These designs were typically calculated based on a maximum processor power consumption, which directly relates to the amount of heat that needed to be dissipated by the processor (and thus the processor's temperature). As a result, excess cooling margins were often provided, resulting in higher cooling system costs and greater audible disturbances (e.g., fan noise).

[0019] Recently, processors have been developed that include built-in thermal management features. For example, the Intel Pentium 4™ processor includes a Thermal Monitor that is integrated into the processor silicon. The Thermal Monitor includes a highly accurate on-die temperature sensing circuit, a signal (PROCHOT#) that indicates the processor has reached its maximum operating temperature, registers to determine status, and a thermal control circuit that can reduce the processor temperature by controlling the duty cycle of the processor clocks.

[0020] The processor temperature is determined through an analog thermal sensor circuit comprising a temperature sensing diode 100, a factory calibrated reference current source 102, and a current comparator 104, as shown in FIG. 1. A voltage applied across the diode will induce a current flow that varies with temperature. By comparing this current with the reference current, the processor temperature can be determined. The reference current source corresponds to the diode current when at the maximum permissible processor operating temperature.

[0021] The Thermal Monitor's thermal control circuit (TCC), when active, lowers the processor temperature by reducing the duty cycle of the internal processor clocks. Typically, the TCC portion of the Thermal Monitor is enabled via the system BIOS. When active, the TCC turns the processor clocks off and then back on with a predetermined duty cycle. In one embodiment, an ACPI (Advanced Configuration and Power Interface) register, performance counter registers, status bits in a model specific register (MSR), and the PROCHOT# output pin are available to monitor and control the Thermal Monitor.

[0022] An exemplary processor clock-throttling scheme is shown in FIG. 2. Under normal operations, the processor clock would cycle as shown by normal clock waveform 202. However, in response to an asserted (low) PROCHOT# signal 200, an internal clock duty cycle control signal begins to be switched on and off, as shown by waveform 204. The resultant internal clock cycle is shown in waveform 206. The actual duty cycle will vary from one product to another. Generally, cycle times will be processor speed dependent and decrease as processor core frequencies increase.
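As an illustration of the duty-cycle scheme, the following minimal sketch estimates an effective clock rate while the clocks are gated. It assumes a simple first-order model (effective frequency equals nominal frequency times the fraction of time the clocks are on); the function name and the example numbers are hypothetical and do not come from any processor datasheet.

```python
def effective_frequency_mhz(nominal_freq_mhz, duty_cycle):
    """Estimate the effective clock rate under clock gating.

    Assumption: the clocks run at the nominal rate while enabled and are
    fully stopped otherwise, so throughput scales with the duty cycle.
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0.0 and 1.0")
    return nominal_freq_mhz * duty_cycle


# Hypothetical example: a 2000 MHz part gated at a 50% duty cycle.
print(effective_frequency_mhz(2000.0, 0.5))  # 1000.0
```

A drop in this effective rate relative to the nominal frequency is one way throttling may become visible to software.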

[0023] For testing purposes, the thermal control circuit may also be activated by setting bits in the ACPI MSRs. The MSRs may be set based on a particular system event (e.g., an interrupt generated after a system event), or may be set at any time through the operating system or custom driver control, thus forcing the thermal control circuit on. This is referred to as an “on-demand” mode. Activating the thermal control circuit may be useful for cooling solution investigations or for performance implication studies.

[0024] To minimize the cost of processor thermal solutions, system designers are encouraged to take advantage of the Thermal Monitor feature capability. The Thermal Monitor feature allows processor thermal solutions to be designed to the thermal design power (TDP) target, as opposed to a maximum processor power dissipation level. Designing to the lower TDP target results in a lower thermal solution cost, while still maintaining a level of processor performance that is virtually indistinguishable from systems designed to manage maximum power dissipation levels.

[0025] Generally, the TDP target is determined as a function of the anticipated thermal stress load the processor will encounter, which in turn is a function of the application software run on the processor. For example, FIG. 3 shows a graph of CPU power vs. number of applications, which is illustrative of a typical power consumption vs. application type distribution. Generally, each application program has its own unique power profile, although the profile has some variability due to loop decisions, I/O activity and interrupts. The graph illustrates a statistical distribution based on averaged application power consumption.

[0026] In general, compute-intensive applications with a high cache hit rate dissipate more processor power than applications that are I/O intensive or have low cache hit rates. This effect is depicted in the graph, wherein the thermal stress (i.e., CPU power) resulting from most applications, such as productivity applications, is moderately low when compared with compute-intensive applications, such as games and scientific applications.

[0027] Typically, the processor TDP is based on measurements of processor power consumption while running various high power applications. This data is used to determine those applications that are interesting from a power perspective. These applications are then evaluated in a controlled thermal environment to determine their sensitivity to activation of the thermal control circuit. This data is used to derive the TDP targets published in the processor datasheet.

[0028] A system designed to meet such published TDP targets greatly reduces the probability of real applications causing the thermal control circuit to activate under normal operating conditions. Systems that do not meet these specifications could be subject to frequent activation of the thermal control circuit depending on ambient air temperature and application power profile. Moreover, if a system is significantly under-designed, there is a risk that the Thermal Monitor feature will not be capable of maintaining a safe operating temperature and the processor could shut down and signal a thermal trip point condition.

[0029] In accordance with aspects of the invention, a thermal solution may be validated by verifying the case-to-ambient thermal resistance, θ_(CA), as specified by the following equation:

θ_(CA) = (T_(C) − T_(A)) / (Power dissipated from case to ambient)  (1)

[0030] where T_(C) is the case temperature of the processor, and T_(A) is the ambient temperature. In one embodiment, thermal solution validation is verified by comparing a measured θ_(CA) to a case-to-ambient thermal characterization parameter Ψ_(CA) specified for the processor, where

Ψ_(CA) = (T_(C) − T_(A)) / (Total Package Power)  (2)

[0031] In one embodiment, the Total Package Power is the TDP target value. Generally, if the measured θ_(CA) is less than the specified Ψ_(CA), the thermal cooling solution is validated.
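The comparison described by equations (1) and (2) can be sketched as follows. This is a minimal illustration only; the function names and the numeric values are hypothetical and are not taken from any processor datasheet.

```python
def case_to_ambient_resistance(t_case_c, t_ambient_c, power_w):
    """Return theta_CA in degrees C per watt, following equation (1)."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return (t_case_c - t_ambient_c) / power_w


def solution_validated(theta_ca_measured, psi_ca_specified):
    """Apply the stated criterion: measured theta_CA below specified Psi_CA."""
    return theta_ca_measured < psi_ca_specified


# Hypothetical measurement: 65 C case, 35 C ambient, 60 W dissipated.
theta = case_to_ambient_resistance(t_case_c=65.0, t_ambient_c=35.0, power_w=60.0)
print(theta)                                            # 0.5
print(solution_validated(theta, psi_ca_specified=0.6))  # True
```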

[0032] Ideally, θ_(CA) could be determined in the following manner. Execute TDP thermal “stress” code comprising various instructions and operations designed to produce a thermal load (stress) on the processor corresponding to a TDP power dissipation condition, and simply measure the case and ambient temperatures. A problem with this approach results from the fact that, due to manufacturing variances, not all processors consume the same amount of power under the same thermal stress load. For example, an exemplary statistical distribution of actual processor power consumption under a common TDP thermal stress load (e.g., through execution of TDP thermal stress code) is shown in FIG. 4. As a result, the value for θ_(CA) would be dependent on the particular processor or processors that is/are tested. Another way to look at the implication of the FIG. 4 graph is that there is no single set of thermal stress code instructions that would induce a TDP power dissipation condition in all processors.

[0033] In accordance with further aspects of the invention, a more accurate result for θ_(CA) is obtained by measuring the temperature of the processor under a processor-independent TDP thermal stress level to produce a TDP power dissipation level. Furthermore, in accordance with one embodiment of the invention, the entire process is done through software. In accordance with another embodiment, the entire process is performed without requiring any extraneous thermal probes or power measurement probe components.

[0034] With reference to the flowchart of FIGS. 5A-5C, a process for performing a software-only platform thermal solution validation in accordance with one embodiment of the invention begins in a block 500 in which the current ambient temperature is obtained. In one embodiment, the ambient temperature is input by test personnel via a user interface provided by a software application that is executed on a processor to perform thermal solution validation testing in accordance with the flowchart. In another embodiment, an ambient temperature value may be retrieved from a temperature measurement device that includes a communications interface that is connected to the computer platform via a standard communications cable, such as a serial or USB cable, and received over a corresponding input/output (I/O) port.

[0035] Next, in a block 502, the processor type, normal operational frequency, and thermal design power typical (TDP_(typ)) are determined, with the latter parameter adjusted for the ambient temperature. Typically, the processor type and frequency can be obtained through a call to the operating system. In another embodiment, such information may be obtained through retrieving corresponding data from processor registers, such as MSRs. In a third embodiment, such information may be manually entered by test personnel. Once the processor type and frequency are determined, the TDP_(typ) value is retrieved from a lookup table that maps processor types and frequencies to their corresponding TDP_(typ) values. The lookup table may include entries for various ambient temperatures, or an adjustment factor may be applied to adjust a manufacturer TDP_(typ) value (typically rated at 35° C.) to correspond to an equivalent TDP_(typ) value for the ambient conditions. In one embodiment, the processor type and frequency to be tested are known in advance, and therefore such information does not need to be determined at run-time as provided by block 502. In addition to the foregoing values, in one embodiment the maximum temperature for the processor, T_(max), is also retrieved.
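One way the lookup of block 502 might be organized is sketched below, using the variant in which the table holds entries for several ambient temperatures. The table contents, the processor name, and the nearest-entry selection rule are all hypothetical placeholders chosen for illustration; real values would come from the manufacturer's published data.

```python
# Hypothetical table: (processor type, frequency in MHz) ->
#   {ambient temperature in deg C: TDP_typ in watts}. Placeholder values only.
TDP_TYP_TABLE = {
    ("example-cpu", 2000): {25.0: 50.0, 35.0: 52.0, 45.0: 54.0},
}


def lookup_tdp_typ(proc_type, freq_mhz, ambient_c):
    """Return the TDP_typ entry whose ambient temperature is closest.

    Assumption: picking the nearest tabulated ambient temperature is
    acceptable; interpolation or an adjustment factor could be used instead.
    """
    entries = TDP_TYP_TABLE[(proc_type, freq_mhz)]
    nearest_ambient = min(entries, key=lambda t: abs(t - ambient_c))
    return entries[nearest_ambient]


print(lookup_tdp_typ("example-cpu", 2000, ambient_c=33.0))  # 52.0
```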

[0036] After the processor type, frequency, and TDP_(typ) values are known, an appropriate set or sets of thermal stress code instructions is identified, and a baseline set of thermal stress code begins execution on the processor under test in a block 504. As discussed above, thermal stress code comprises instructions and operations designed to induce a thermal load (i.e., thermal stress) on the processor. The terminology “sets of thermal stress code instructions” is indicative that in one embodiment multiple instruction/operation sets are employed during testing to adjust the thermal stress on the processor, as described below. Generally, it may be advantageous to create sets of thermal stress code instructions that are particular to a corresponding processor and/or processor/frequency combination; however, this is not required. Accordingly, in one embodiment processor-specific sets of thermal stress code are employed, while in another embodiment a “generic” set or sets of stress code is employed. Following initiation of the execution of the thermal test code, the processor operating frequency is read in a block 506 to verify it is running at its designed frequency.

[0037] The remaining operations pertain to run-time testing, which are selectively performed based on the result of various conditionals and loop logic defined by the flowchart. The primary loop, which is defined by start and end loop blocks 508 and 509, begins by reading the processor temperature. In accordance with one embodiment, the processor temperature is provided internally by the processor, in a manner similar to that discussed above, and the temperature value is stored in a platform storage location, such as a register or memory location. In one embodiment, the temperature can be retrieved from the register using an ACPI-compliant operating system call. In another embodiment, the temperature value is accessed via direct access to the register (i.e., accessed directly by the testing application, without employing the OS). Both techniques are known in the art, and will generally be processor type and/or OS-type dependent.

[0038] Next, in a decision block 510, a determination is made as to whether a thermal overload condition currently exists or was tripped. Details for determining the result of this decision block are shown in FIG. 5B. In general, one or more of the conditional tests shown in FIG. 5B may be implemented to make the determination. These include determining in a decision block 512 whether the processor temperature is greater than T_(max). Another indication as to whether a thermal overload condition exists or was tripped is obtained by monitoring the processor frequency. If a thermal overload condition is tripped, the processor's built-in logic (e.g., via the thermal control circuit) may “throttle” the processor by reducing its frequency. For example, built-in circuitry such as the foregoing Thermal Monitor may be used to sense a thermal overload condition, and throttle the processor's execution speed via lowering the effective processor frequency. In some instances, this throttling is effectuated via the duty-cycle scheme of FIG. 2. In other cases, the processor frequency may be reduced while maintaining a constant duty cycle (e.g., through a frequency divider network or component). In either situation, various known techniques may be employed to determine the effective processor frequency. This frequency test is depicted by a decision block 514.

[0039] As discussed above, Intel Pentium 4 processors assert a PROCHOT# signal when a thermal overload condition is detected while the Thermal Monitor is enabled. Accordingly, as depicted by a decision block 516, a determination can be made as to whether the PROCHOT# signal is asserted. If any of the determinations corresponding to decision blocks 512, 514, and 516 is YES (TRUE), the thermal solution has failed, and a corresponding output is provided to the test operator and/or recorded in a block 518, whereupon the test is terminated.
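The three overload tests of decision blocks 512, 514, and 516 can be combined as in the following minimal sketch. The `ProcessorSample` container, the field names, and the `freq_tolerance` margin used to decide that the processor is throttled are assumptions made for illustration; how each value is actually read is platform dependent and is not shown.

```python
from dataclasses import dataclass


@dataclass
class ProcessorSample:
    """Hypothetical snapshot of the values read during one loop pass."""
    temperature_c: float
    effective_freq_mhz: float
    prochot_asserted: bool


def thermal_overload(sample, t_max_c, nominal_freq_mhz, freq_tolerance=0.02):
    """Return True if any of the three overload indications is present."""
    # Block 512: processor temperature above T_max.
    too_hot = sample.temperature_c > t_max_c
    # Block 514: effective frequency noticeably below nominal (throttled).
    # The 2% tolerance is an assumed margin for measurement jitter.
    throttled = sample.effective_freq_mhz < nominal_freq_mhz * (1.0 - freq_tolerance)
    # Block 516: PROCHOT# asserted.
    return too_hot or throttled or sample.prochot_asserted


sample = ProcessorSample(temperature_c=68.0, effective_freq_mhz=2000.0,
                         prochot_asserted=False)
print(thermal_overload(sample, t_max_c=72.0, nominal_freq_mhz=2000.0))  # False
```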

[0040] If the operations of FIG. 5B return a NO (FALSE) result, the logic proceeds to a block 520 in which the current processor power consumption is obtained. As described below in further detail, in one embodiment a voltage regulator with built-in power measurement capabilities and corresponding conversion circuitry is employed to provide power to the processor. Accordingly, in this embodiment the processor power can be obtained by reading an appropriate register or storage location in which a value indicative of the current processor power consumption is stored. In another embodiment, the overall power consumption of the platform is measured via an external measurement, as described below. The processor portion of this overall total is then derived to obtain the current processor power consumption.

[0041] Moving forward to the portion of the flowchart at the top of FIG. 5C, the next set of operations pertains to adjusting the thermal stress load (i.e., power dissipation) of the processor such that it is maintained at or near TDP_(typ). In one embodiment, this is enabled through appropriate logic built into the stress code or other test application component in accordance with the following logic. In a block 522, a determination is made as to whether the current processor power consumption is less than, equal to, or greater than TDP_(typ). If the processor power is less than TDP_(typ), the thermal stress is too low, and thus the thermal stress is increased via selective execution of a new set of stress code instructions that produce a higher thermal stress when executed. For example, in one embodiment, the thermal stress code comprises a plurality of sets of thermal stress code instructions (i.e., sets of respective instructions, operations, and branching logic, etc.), wherein each respective set of thermal stress code instructions produces a different level of power consumption, on a relative (non-absolute) basis, as illustrated in FIG. 8 and discussed in further detail below. At the same time, the identity of the set of thermal stress code currently executing is monitored. If the thermal stress is to be increased, the set of thermal stress code that produces the next highest level of power consumption begins to execute in place of the previously executing set of thermal stress code, becoming the new current set of thermal stress code, as depicted by a block 524.

[0042] Conversely to the under-stressed condition, if the processor power level is greater than TDP_(typ), the processor is thermally overstressed, and its power consumption should be reduced. Thus, the set of thermal stress code instructions that produces the next lower level of power dissipation begins to execute, becoming the new current set, as depicted by a block 526. Finally, if the current processor power level is already at TDP_(typ), there is no need to change the current set of thermal stress code, and thus the current set of code continues to execute.
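The selection logic of blocks 522, 524, and 526 can be sketched as a simple step up or down through an ordered list of stress code sets. In this sketch the sets are assumed to be indexed from lowest to highest relative power, and the `tolerance_w` dead band is an assumed way of interpreting "at or near TDP_(typ)"; neither detail is specified above.

```python
def next_stress_level(current_level, power_w, tdp_typ_w, num_levels, tolerance_w=0.5):
    """Pick the index of the stress code set to run next.

    Levels are assumed to be ordered from lowest (0) to highest
    (num_levels - 1) relative power consumption.
    """
    if power_w < tdp_typ_w - tolerance_w:
        # Under-stressed: move to the next higher set (block 524).
        return min(current_level + 1, num_levels - 1)
    if power_w > tdp_typ_w + tolerance_w:
        # Over-stressed: move to the next lower set (block 526).
        return max(current_level - 1, 0)
    # Already at or near TDP_typ: keep the current set.
    return current_level


# Hypothetical run: 8 sets available, target of 52 W.
print(next_stress_level(3, power_w=48.0, tdp_typ_w=52.0, num_levels=8))  # 4
print(next_stress_level(3, power_w=55.0, tdp_typ_w=52.0, num_levels=8))  # 2
print(next_stress_level(3, power_w=52.2, tdp_typ_w=52.0, num_levels=8))  # 3
```

Clamping at the first and last index keeps the selection valid when the processor cannot reach, or cannot drop below, the target with the available sets.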

[0043] After the determination is made as to which set of thermal stress code should be executing, a determination is made in a decision block 528 as to whether the processor temperature is steady. Basically, the objective is to provide a processor thermal stress via the selective execution of the sets of stress code such that the processor reaches a steady-state temperature in conjunction with the processor power dissipation substantially matching TDP_(typ). In one embodiment, once this combination of conditions exists, the value for θ_(CA) can be calculated at the target TDP value. In another embodiment, if the processor can operate at TDP_(typ) without inducing a thermal overload, the thermal solution is validated. The θ_(CA) calculation, if employed, along with the issuance of a system passed message and/or recording, is performed in a block 530, completing the test. As provided by end loop block 509, in the event the steady-state condition hasn't been reached, the process loops back up to repeat the foregoing operations defined between start loop block 508 and the end loop block. This process loop is continued until either a test failure occurs, or a steady-state temperature at a TDP_(typ) power consumption level is obtained without inducing a thermal overload, which may also be used to indicate a test success.
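The text does not state how decision block 528 judges that the temperature is steady. One plausible approach, sketched below purely as an assumption, is to keep a sliding window of recent readings and treat the temperature as steady once the spread within a full window falls below a threshold. The class name, window size, and threshold are hypothetical.

```python
from collections import deque


class SteadyStateDetector:
    """Sliding-window steadiness check (an assumed criterion)."""

    def __init__(self, window=5, max_spread_c=0.5):
        self.samples = deque(maxlen=window)
        self.max_spread_c = max_spread_c

    def update(self, temperature_c):
        """Record a reading; return True once the window is full and flat."""
        self.samples.append(temperature_c)
        if len(self.samples) < self.samples.maxlen:
            return False
        return max(self.samples) - min(self.samples) <= self.max_spread_c


detector = SteadyStateDetector(window=5, max_spread_c=0.5)
readings = [55.0, 58.0, 60.0, 60.1, 60.2, 60.1, 60.0, 60.1]
results = [detector.update(t) for t in readings]
print(results[-1])  # True
```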

[0044] As discussed above, in one embodiment the thermal solution validation process is handled entirely by software. This is enabled, in part, by using a built-in (to the platform) means for measuring the processor power consumption. Such a power measurement means provides a power measurement output value that may be stored in a register or other platform storage location and accessed by the thermal solution validation test software, thereby providing an accurate measurement of the current processor power consumption.

[0045] With reference to FIG. 6, in one embodiment a power measurement means is implemented via a voltage regulator 600 that supplies power to a processor 602. Oftentimes, voltage regulators and the like, which provide separate power outputs relative to a system's main power supply, are employed to ensure that the power provided to critical circuits such as processors is very accurate, with a minimum amount of noise. Typically, the voltage regulator receives input power from the platform's power supply, as depicted by a power supply 606.

[0046] Voltage regulator 600 provides a regulated predetermined fixed voltage V+ to CPU 602. For example, V+ may typically comprise 3.1-3.3 volts for modern microprocessors, although voltages in the general range of 2.2-5.0 volts are also common, depending on the processor's circuit composition. Once the voltage is known, the amount of power consumed by processor 602 can be determined by measuring the current flow into the processor.

[0047] There are many well-known schemes for determining current flowing through a wire. For example, in one embodiment a current sense resistor 608 is employed to sense the current supplied to processor 602. The voltage drop across the sense resistor is measured by an analog-to-digital (A/D) converter 610. The amount of current flowing through the resistor, and thus being supplied to the processor, can then easily be obtained by knowing the resistance of the resistor. A value corresponding to the input current, or optionally input power (the determined current times V+), is then provided to a system management bus (SMBUS), via an interface (I/F) 612. The SMBUS is used for passing various system management-related information among system management components. It may also be used to store system management data, such as the processor input current or input power in various registers, or in a special portion of a platform's RAM known as SMRAM. Thus, the processor input current and/or power can be measured and stored on a periodic basis, and retrieved from a storage location by the thermal solution validation software to determine the current power consumption of the processor.
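The sense-resistor arithmetic described above amounts to Ohm's law followed by a multiplication by the supply voltage. A minimal sketch follows; the function name and the example component values are hypothetical.

```python
def processor_power_w(sense_voltage_v, sense_resistance_ohm, supply_voltage_v):
    """Compute processor input power from a sense-resistor reading.

    current = (voltage drop across sense resistor) / (its resistance)
    power   = current * V+
    """
    if sense_resistance_ohm <= 0:
        raise ValueError("sense_resistance_ohm must be positive")
    current_a = sense_voltage_v / sense_resistance_ohm
    return current_a * supply_voltage_v


# Hypothetical values: 20 mV across a 1 milliohm resistor with V+ = 3.3 V,
# i.e. roughly 20 A and roughly 66 W.
print(round(processor_power_w(0.020, 0.001, 3.3), 3))
```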

[0048] Additional details of the thermal management features are also shown in FIG. 6. These include a means for measuring the processor temperature, and means for providing the measured temperature to external components. In one embodiment, the means for measuring the processor temperature comprises an A/D converter 614, which is connected so as to measure the voltage differential across temperature sensing diode 100. The output of the A/D converter is then made available to other system components coupled to the SMBUS via an interface 616.

[0049] Generally, in order to implement the scheme of FIG. 6, an appropriately configured voltage regulator and interface circuitry must already exist for the particular platform being tested. In many instances, this will not be available. As a result, an alternate means must be provided for determining the processor power dissipation.

[0050] One embodiment of such an alternate processor power dissipation measurement means is shown in FIG. 7. In this instance, the amount of power consumed by the platform as a whole is measured, and then the portion of the power consumed by the processor is extracted from the total power consumption measurement. As discussed above, in many instances it is more difficult to design a thermal cooling solution for a laptop or notebook computer, since the airflow around the processor is much more restrictive than in most desktop configurations. Furthermore, the use of heatsinks in laptops and notebooks is very limited, so other heat transfer mechanisms are employed.

[0051] Most of today's laptop computers employ an external power supply that is connected to an AC power source, such as depicted by a laptop computer 700, power supply 702, and AC power source 704 in FIG. 7. The power supply produces a DC output 708 that is received as input power to the laptop. Typically, the input is received via a power conversion circuit or component 710, which is used to supply power to the laptop's motherboard 712, and other platform components, including peripheral devices such as a floppy drive 714, a CD ROM drive 716, a hard disk 718, and drive circuitry to drive a display 720. Power is also provided to other circuits and components, such as one or more cooling fans, speakers, network interfaces, etc. The power conversion circuitry or component is also typically coupled to a battery 722.

[0052] Generally, a processor 724 and one or more memory modules 726 will be connected to motherboard 712. The motherboard will also include a plurality of integrated circuit (IC) components, such as chipset components, communication interface components (e.g., for network interfaces, serial and parallel ports, USB ports, etc.), display drive components, a keyboard interface component for interacting with a keyboard 728, and other ICs. The motherboard will typically also include various passive components (e.g., resistors, capacitors, etc.).

[0053] In accordance with the illustrated embodiment, a power measurement device 730, such as a clamp meter, is used to measure the input power supplied to laptop computer 700. The power measurement device may perform a current measurement, or may directly provide a power measurement. For example, Fluke Corporation of Everett, Wash. manufactures several clamp meters that include voltage inputs, enabling such meters to provide a direct power measurement. In an optional configuration, a custom power supply may be employed that provides built-in power measurement capabilities (not shown). In such an implementation, the power supply should provide substantially the same output voltage and current as the laptop's normal power supply.

[0054] The power measurement device provides a power measurement signal (current or input power) that is accessible to laptop computer 700 via a communication interface 730, such as a serial interface or a USB interface. Generally, the power measurement device can directly provide such an interface, or it may be coupled to another device that provides the interface. This enables the thermal solution validation software to retrieve the current power consumption of the laptop computer.

[0055] As above, the objective is to determine the processor power dissipation on an ongoing basis. In order to obtain this, the amount of power consumed by the processor and the amount of power consumed by other platform components will be derived. Ideally, testing should be performed in a manner that enables as many components as possible to be at rest, such as floppy drive 714, CD ROM drive 716, hard disk 718, etc. The idea is to determine a baseline power consumption level of the platform (particularly the motherboard alone, if possible), and then calculate the power use of the processor based on the change in power consumption during the test. Furthermore, testing should be performed in a manner in which the laptop's fan or fans are operating at a continuous level (e.g., off, at an intermediate level, or maximum level).

[0056] Generally, the largest consumers of power on a laptop are the processor, peripheral devices, and the display screen. In contrast, the amount of power consumed by the other components (besides the processor) on the motherboard is typically much smaller. Like the processor, the amount of power consumed by a given motherboard and peripheral devices will vary, primarily due to manufacturing variances of the various motherboard components. This variance will affect the accuracy of the processor power dissipation determination, since the total platform power consumption, which will depend on both the processor and the other platform circuitry and components, is measured rather than the processor power dissipation directly.
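
The subtraction described in paragraphs [0055] and [0056] can be illustrated with a short sketch. It is not taken from the specification: the function name `processor_power_w`, its parameters, and the sample wattages are hypothetical, and it assumes the non-processor load stays constant between the baseline measurement and the test.

```python
from statistics import mean


def processor_power_w(platform_samples_w, non_processor_w):
    """Estimate processor power from whole-platform input power.

    platform_samples_w: recent readings of platform input power, in watts
        (hypothetical values as reported by an external power meter).
    non_processor_w: baseline power of everything except the processor,
        in watts, assumed constant during the test.
    """
    if not platform_samples_w:
        raise ValueError("at least one platform power sample is required")
    # Average several readings to smooth out meter noise.
    platform_w = mean(platform_samples_w)
    estimate = platform_w - non_processor_w
    if estimate < 0:
        # A negative result means the baseline assumption does not hold.
        raise ValueError("baseline exceeds measured platform power")
    return estimate


# Illustrative numbers only: about 38 W at the DC input, 14.5 W baseline.
print(processor_power_w([38.0, 38.5, 37.5], 14.5))
```

Any error in the baseline carries directly into the processor estimate, which is why the text recommends keeping drives at rest and fans at a continuous level during the test.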

[0057] Exemplary Software Architecture

[0058] An exemplary software architecture diagram in accordance with one embodiment of the invention is shown in FIG. 8. Generally, the software will comprise an application program that includes a plurality of components. The components may be contained within the executable code for the application program, or may comprise separate modules, such as dynamic link libraries, externally callable code segments, etc. For illustrative purposes, a plurality of exemplary components are shown as comprising portions of an application program 800.

[0059] Application program 800 includes a supervisor component 802, a user interface 804, a temperature, power, and processor monitoring block 806, a processor thermal control loop block 808, and thermal stress code 810. Generally, each of these components may be implemented as separate threads, multiple components may be combined into a single thread, or multiple threads may be employed for an individual component. In addition, application program 800 includes a lookup table 812 in which processor parameters are stored, such as T_(max), TDP_(typ), etc. Typically a set of processor parameters will be stored for each processor type and/or processor/frequency combination.

[0060] In one embodiment, the supervisor component 802 comprises a primary thread that is responsible for controlling the overall operations and interactions with the application program. The user interface 804 is provided to enable test personnel to interact with the application, such as by providing various inputs and viewing displayed test results.

[0061] Operations and logic provided by the temperature, power, and processor monitoring block 806, processor thermal control loop block 808, and thermal stress code 810 generally correspond to the operations and logic discussed above with reference to the flowchart of FIGS. 5A-5C. For example, operations provided by temperature, power, and processor monitoring block 806 include retrieving and/or monitoring data pertaining to processor and ambient temperature, processor power dissipation measurements, and thermal overload condition measurements and monitoring operations. Meanwhile, the processor thermal control loop block 808 is responsible for adjusting the processor power dissipation level such that it substantially matches the target level, e.g., TDP. This is accomplished by comparing the current measured or determined processor power dissipation level with the target, and then adjusting the thermal stress up or down accordingly based on selective execution of sets of thermal stress code instructions 814 n. Generally, the sets of thermal stress code instructions are designed to induce a thermal stress on the processor that is proportional to the processor dissipation level target, wherein the sets are arranged to produce incremental changes in the thermal stress. For example, in the illustrated example, the increment is 4% of TDP. Accordingly, processor thermal control loop block 808 selects a new set of thermal stress code instructions to execute, if necessary, based on the logic discussed above with reference to FIG. 5C.
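
The compare-and-adjust behavior of the control loop can be sketched as follows. This is a minimal illustration under stated assumptions rather than the logic of FIG. 5C, which is not reproduced here: the names `next_level` and `settle`, the integer indexing of stress-code sets, the tolerance band, and the linear power model in the usage lines are all hypothetical. Only the idea of stepping between sets in fixed increments (4% of TDP in the illustrated example) comes from the text.

```python
def next_level(level, measured_w, target_w, tolerance_w, max_level):
    """Choose the index of the stress-code set to run next.

    Steps one set up if measured power is below the target, one set down
    if it is above, and keeps the current set when within tolerance.
    """
    error = target_w - measured_w
    if abs(error) <= tolerance_w:
        return level
    if error > 0:
        return min(level + 1, max_level)
    return max(level - 1, 0)


def settle(measure, target_w, tolerance_w, max_level, start_level=0,
           max_iters=100):
    """Repeat measure/compare/select until the selected set stops changing.

    measure: callable taking a set index and returning processor power in
        watts while that set runs (hypothetical stand-in for a real reading).
    Returns the final set index. The index also stops changing when the
    range is saturated at 0 or max_level, so callers should verify the
    measured power afterwards.
    """
    level = start_level
    for _ in range(max_iters):
        new_level = next_level(level, measure(level), target_w,
                               tolerance_w, max_level)
        if new_level == level:
            return level
        level = new_level
    raise RuntimeError("control loop did not reach a steady state")


# Usage with an assumed 50 W target, where each set adds 2 W (4% of 50 W).
model = lambda level: level * 2.0
print(settle(model, target_w=50.0, tolerance_w=1.0, max_level=30))
```

Choosing a tolerance of at least half the step size keeps this simple stepping rule from oscillating between two adjacent sets.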

[0062] Exemplary Platform on which Embodiments of the Invention may be Implemented

[0063] Returning to FIG. 7, laptop computer 700 is further generally illustrative of various platforms that can be tested using embodiments of the thermal solution validation software scheme disclosed herein. For example, a typical platform will include a motherboard having a processor, memory, chipset components, and other ICs, which is disposed in a chassis. Generally, the chassis may comprise a laptop case, or may comprise a desktop or server case. The platform will also include peripheral devices, such as floppy drives, CD ROM drives, hard disks, etc., which are usually internally housed in the chassis.

[0064] The software comprising the thermal solution validation test application (and optional supporting modules) will typically be loaded via a floppy drive or CD ROM and stored on a hard disk or other persistent storage device in the platform. Optionally, the test code may be loaded at run-time from a floppy disk or a CD ROM. As yet another optional loading mechanism, the software may be downloaded over a network.

[0065] The software application code, which will typically comprise a plurality of instructions and data, will be executed on the platform processor during the test. Generally, the instructions may comprise directly executable machine code, or may comprise instructions corresponding to an intermediate language that is compiled and executed at run-time by an intermediate language engine, such as Java code, C# code, etc.

[0066] Thus, embodiments of this invention may be used as or to support a software application executed upon a platform's processor or otherwise implemented or realized upon or within a machine-readable medium. A machine-readable medium includes any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer). For example, a machine-readable medium can include a read only memory (ROM); a random access memory (RAM); magnetic disk storage media; optical storage media; a flash memory device; etc. In addition, a machine-readable medium can include propagated signals such as electrical, optical, acoustical or other forms of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.).

[0067] The above description of illustrated embodiments of the invention, including what is described in the Abstract, is not intended to be exhaustive or to limit the invention to the precise forms disclosed. While specific embodiments of, and examples for, the invention are described herein for illustrative purposes, various equivalent modifications are possible within the scope of the invention, as those skilled in the relevant art will recognize.

[0068] These modifications can be made to the invention in light of the above detailed description. The terms used in the following claims should not be construed to limit the invention to the specific embodiments disclosed in the specification and the claims. Rather, the scope of the invention is to be determined entirely by the following claims, which are to be construed in accordance with established doctrines of claim interpretation.

What is claimed is:
 1. An automated method for validating a thermal solution for a computer platform, comprising: adjusting, through selective execution of thermal stress code on a platform processor, an amount of power dissipated by the processor to substantially match a target power dissipation level; and determining if the processor is thermally overloaded while operating at the target power dissipation level, wherein the thermal solution is validated if the processor is not thermally overloaded while operating at the target power dissipation level.
 2. The method of claim 1, wherein the determination to whether the processor is thermally overloaded is determined by performing the operations of: internally measuring the temperature of the processor; and verifying that the internally measured temperature of the processor does not exceed a maximum specified processor temperature.
 3. The method of claim 1, wherein the determination to whether the processor is thermally overloaded is based on a determination to whether the processor is throttled while operating at the target power dissipation level.
 4. The method of claim 1, wherein the determination to whether the processor is thermally overloaded is based on detecting a signal output by the processor that indicates the processor is thermally overloaded.
 5. The method of claim 1, wherein the amount of power dissipated by the processor is adjusted by performing the operations of: (a) executing an initial set of thermal stress code instructions; (b) determining a current amount of power dissipated by the processor; (c) comparing the current amount of power dissipated with the target power dissipation level; (d) selecting a new set of thermal stress code instructions to execute based on the result of operation (c), wherein execution of the new set of thermal stress code instructions increases the amount of power dissipated by the processor if the current amount of power dissipated is less than the target dissipation level or decreases the amount of power dissipated by the processor if the current amount of power dissipated is greater than the target dissipation level; and (e) repeating operations (b)-(d) until a steady-state condition is reached, wherein the current amount of power dissipated by the processor substantially matches the target power dissipation level.
 6. The method of claim 5, wherein the current amount of power dissipated by the processor is determined through use of a built-in platform component that detects one of an input current or input power consumed by the processor.
 7. The method of claim 5, wherein the current amount of power dissipated by the processor is determined by performing the operations of: determining an amount of power dissipated by platform components exclusive of the processor during platform operations; measuring an input power consumed by the platform; and subtracting the determined amount of power dissipated by the platform components exclusive of the processor from the measured input power consumed by the platform to obtain the current amount of power dissipated by the processor.
 8. The method of claim 7, wherein the platform comprises a computer that employs an external power supply and the input power consumed by the platform is measured by measuring a DC power input to the platform.
 9. The method of claim 1, wherein the platform includes a chassis in which computer electronics including the processor are housed and all of the thermal solution validation operations are performed via software executing on the processor without using any extraneous sensors contained within the platform chassis.
 10. The method of claim 9, wherein the thermal solution validation is performed without using any components that are extraneous from the platform.
 11. An automated method for validating a thermal solution for a computer platform, comprising: adjusting, through selective execution of thermal stress code on a platform processor, an amount of power dissipated by the processor to substantially match a target power dissipation level; and calculating a thermal dissipation characterization value corresponding to the thermal solution while the processor is operating at the target power dissipation level; and comparing the calculated thermal dissipation characterization value with a specified thermal dissipation characterization parameter for the processor to determine whether to validate the thermal solution.
 12. The method of claim 11, wherein the calculated thermal dissipation characterization comprises a case-to-ambient thermal resistance that is calculated from the equation, θ_(CA) = (T_(C) − T_(A)) / (power dissipated by processor case to ambient), wherein T_(C) is the processor case temperature and T_(A) is the ambient air temperature, further comprising: obtaining T_(A); obtaining T_(C); determining the current power dissipated by the processor; and calculating the case-to-ambient thermal resistance.
 13. The method of claim 12, wherein T_(A) is obtained by entry of the ambient temperature by test personnel via a software interface.
 14. The method of claim 12, wherein T_(A) is obtained by reading a temperature measurement device having a probe disposed in the ambient air.
 15. The method of claim 12, wherein T_(C) is obtained via a temperature sensing circuit built into the processor.
 16. The method of claim 11, wherein the amount of power dissipated by the processor is adjusted by performing the operations of: (a) executing an initial set of thermal stress code instructions; (b) determining a current amount of power dissipated by the processor; (c) comparing the current amount of power dissipated with the target power dissipation level; (d) selecting a new set of thermal stress code instructions to execute based on the result of operation (c), wherein execution of the new set of thermal stress code instructions increases the amount of power dissipated by the processor if the current amount of power dissipated is less than the target dissipation level or decreases the amount of power dissipated by the processor if the current amount of power dissipated is greater than the target dissipation level; and (e) repeating operations (b)-(d) until a steady-state condition is reached, wherein the current amount of power dissipated by the processor substantially matches the target power dissipation level.
 17. The method of claim 16, wherein the current amount of power dissipated by the processor is determined through use of a built-in platform component that detects one of an input current or input power level consumed by the processor.
 18. The method of claim 16, wherein the current amount of power dissipated by the processor is determined by performing the operations of: determining an amount of power dissipated by platform components exclusive of the processor during platform operations; measuring an input power consumed by the platform; and subtracting the determined amount of power dissipated by the platform components exclusive of the processor from the measured input power consumed by the platform to obtain the current amount of power dissipated by the processor.
 19. The method of claim 18, wherein the platform comprises a computer that employs an external power supply and the input power consumed by the platform is measured by measuring a DC power input to the platform.
 20. The method of claim 11, wherein the platform includes a chassis in which computer electronics including the processor are housed and all of the thermal solution validation operations are performed via software executing on the processor without using any extraneous sensors contained within the platform chassis.
 21. The method of claim 20, wherein the validation of the thermal solution is performed without using any components that are extraneous from the platform.
 22. A machine-readable media having instructions stored thereon including thermal stress code, which when executed by a platform processor performs a validation of the platform's thermal solution by performing the operation of: adjusting, through selective execution of the thermal stress code by the processor, an amount of power dissipated by the processor to substantially match a target power dissipation level; determining if the processor is thermally overloaded while operating at the target power dissipation level; and outputting a thermal solution validation test result, wherein the thermal solution is validated if the processor is not thermally overloaded while operating at the target power dissipation level.
 23. The machine-readable media of claim 22, wherein execution of the instructions determines whether the processor is thermally overloaded by performing the operations of: retrieving an internally measured temperature of the processor from a storage location; and verifying that the internally measured temperature of the processor does not exceed a maximum specified processor temperature.
 24. The machine-readable media of claim 22, wherein execution of the instructions determines whether the processor is thermally overloaded by performing the operation of monitoring a processor register in which processor speed information is stored to determine whether the processor is throttled while operating at the target power dissipation level.
 25. The machine-readable media of claim 22, wherein execution of the instructions determines whether the processor is thermally overloaded by performing the operation of monitoring a processor output that indicates the processor is thermally overloaded when asserted.
 26. The machine-readable media of claim 22, wherein execution of the instructions adjusts the amount of power dissipated by the processor by performing the operations of: (a) executing an initial set of thermal stress code instructions; (b) reading a value indicative of a current amount of power dissipated by the processor from one of a storage location or input/output port; (c) comparing the current amount of power dissipated by the processor with the target power dissipation level; (d) selecting, if necessary, a new set of thermal stress code instructions to execute based on the result of operation (c), wherein execution of the new set of thermal stress code instructions increases the amount of power dissipated by the processor if the current amount of power dissipated is less than the target dissipation level or decreases the amount of power dissipated by the processor if the current amount of power dissipated is greater than the target dissipation level; and (e) repeating operations (b)-(d) until a steady-state condition is reached, wherein the current amount of power dissipated by the processor substantially matches the target power dissipation level.
 27. A machine-readable media having instructions stored thereon including thermal stress code, which when executed by a platform processor performs a validation of the platform's thermal solution by performing the operation of: adjusting, through selective execution of the thermal stress code by the processor, an amount of power dissipated by the processor to substantially match a target power dissipation level; retrieving thermal variables pertaining to a thermal environment for the platform and calculating a thermal dissipation characterization value based on those thermal variables while the processor is operating at the target power dissipation level; comparing the calculated thermal dissipation characterization value with a specified thermal dissipation characterization parameter for the processor to determine whether to validate the thermal solution; and outputting a thermal solution validation test result based on the results of the comparison between the calculated thermal dissipation characterization value and the specified thermal dissipation characterization parameter.
 28. The machine-readable media of claim 27, wherein the calculated thermal dissipation characterization value comprises a case-to-ambient thermal resistance that is calculated from the equation, θ_(CA) = (T_(C) − T_(A)) / (power dissipated by processor case to ambient), wherein T_(C) is the processor case temperature and T_(A) is the ambient air temperature, and execution of the instructions further performs the operations of: obtaining T_(A) from one of a value entered via a computer interface or reading a value produced by a temperature measurement device via a platform input/output (I/O) port; reading one of a platform storage location or I/O port to obtain T_(C); reading a value indicative of a current amount of power dissipated by the processor from one of a platform storage location or I/O port; and calculating the case-to-ambient thermal resistance.
 29. The machine-readable media of claim 27, wherein the processor provides a thermal management mode, and execution of the instructions generates a command to enable the thermal management mode.
 30. The machine-readable media of claim 27, wherein execution of the instructions adjusts the amount of power dissipated by the processor by performing the operations of: (a) executing an initial set of thermal stress code instructions; (b) reading a value indicative of a current amount of power dissipated by the processor from one of a storage location or input/output port; (c) comparing the current amount of power dissipated by the processor with the target power dissipation level; (d) selecting, if necessary, a new set of thermal stress code instructions to execute based on the result of operation (c), wherein execution of the new set of thermal stress code instructions increases the amount of power dissipated by the processor if the current amount of power dissipated is less than the target dissipation level or decreases the amount of power dissipated by the processor if the current amount of power dissipated is greater than the target dissipation level; and (e) repeating operations (b)-(d) until a steady-state condition is reached, wherein the current amount of power dissipated by the processor substantially matches the target power dissipation level.